CQFD - 2012




Scientific Foundations
Application Domains
New Results
Bilateral Contracts and Grants with Industry
Bibliography






Section: New Results

Optimal stopping for partially observed piecewise-deterministic Markov processes

Participants : Adrien Brandejsky, Benoîte de Saporta, François Dufour.

We have investigated an optimal stopping problem under partial observation for piecewise-deterministic Markov processes (PDMPs), both from the theoretical and the numerical point of view. PDMPs were introduced by Davis [73] as a general class of stochastic models: they form a family of Markov processes combining deterministic motion punctuated by random jumps. One property of PDMPs that is central to the approach developed in this work is that the distribution of a PDMP is completely characterized by the embedded discrete-time Markov chain $(Z_n, S_n)_n$, where $Z_n$ is the $n$-th post-jump location and $S_n$ is the $n$-th inter-jump time. We consider the following optimal stopping problem for a partially observed PDMP $(X_t)_{t \ge 0}$. Roughly speaking, the observation process $(Y_t)_{t \ge 0}$ is a point process defined through the embedded chain $(Z_n, S_n)_n$: the inter-arrival times are given by $(S_n)_n$ and the marks by a noisy function of $(Z_n)_n$. For a given reward function $g$ and a computation horizon $N$, we study the optimal stopping problem
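To make the embedded-chain viewpoint concrete, here is a toy simulation of the chain $(Z_n, S_n)_n$ for a one-dimensional PDMP with a linear deterministic flow, a constant jump rate, and uniformly perturbed post-jump locations. All modelling choices (flow, rate, jump kernel) are illustrative assumptions for this sketch, not the model studied in the paper.

```python
import random

def simulate_embedded_chain(z0, n_jumps, rate=1.0, drift=0.5, seed=0):
    """Simulate the embedded chain (Z_n, S_n) of a toy 1-D PDMP.

    Between jumps the state follows the deterministic flow
    phi(z, t) = z + drift * t; jumps occur after Exp(rate) waiting
    times, and the post-jump location is drawn uniformly around the
    pre-jump position.  Illustrative assumptions only.
    """
    rng = random.Random(seed)
    chain = [(z0, 0.0)]          # (post-jump location Z_n, inter-jump time S_n)
    z = z0
    for _ in range(n_jumps):
        s = rng.expovariate(rate)           # inter-jump time S_n
        x_pre = z + drift * s               # deterministic flow up to the jump
        z = x_pre + rng.uniform(-1.0, 1.0)  # random post-jump location Z_n
        chain.append((z, s))
    return chain
```

Because the law of the PDMP is characterized by this chain, storing only the pairs $(Z_n, S_n)$ is enough to reconstruct the full trajectory between jumps via the deterministic flow.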

$$\sup_{\sigma \le T_N} \mathbb{E}\bigl[g(X_\sigma)\bigr],$$

where $T_N$ is the $N$-th jump time of the PDMP $(X_t)_{t \ge 0}$ and $\sigma$ is a stopping time with respect to the natural filtration $\mathcal{F}^o = (\mathcal{F}^o_t)_{t \ge 0}$ generated by the observations $(Y_t)_{t \ge 0}$.

A general methodology for solving such a problem is to split it into two sub-problems. The first consists in deriving the filter process, given by the conditional law of $X_t$ with respect to the observed information $\mathcal{F}^o_t$. Its main purpose is to transform the initial problem into a completely observed optimal stopping problem in which the new state variable is the filter process. The second step consists in solving this reformulated problem, the new difficulty being its infinite dimension: the filter process takes values in a set of probability measures.
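Purely schematically (the precise construction in the paper must account for the singular components of the PDMP distribution discussed below), the filter at the $n$-th jump time and the resulting state variable can be written as:

```latex
% Schematic notation: E denotes the finite set of post-jump locations.
\Pi_n(z) \;=\; \mathbb{P}\bigl(Z_n = z \,\big|\, \mathcal{F}^o_{T_n}\bigr),
\qquad z \in E,
```

so that the partially observed problem is driven by the chain $(\Pi_n, S_n)_n$, whose component $\Pi_n$ lives in the simplex of probability measures on $E$; this is the infinite- (here, simplex-) dimensional state referred to above.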

Our work is inspired by [92], which deals with an optimal stopping problem under partial observation for a Markov chain with finite state space. The authors study the optimal filter and convert the original problem into a standard optimal stopping problem for a continuous-state-space Markov chain. They then propose a discretization method based on a quantization technique to approximate the value function. However, their method cannot be applied directly to our problem, for the following reasons related to the specificities of PDMPs.

Firstly, PDMPs are continuous-time processes, so it is natural to work with the embedded Markov chain $(Z_n, S_n)_n$; in addition, we assume that $(Z_n)_n$ takes finitely many values. An important difficulty, however, is that the structure of the stopping times remains intrinsically continuous. Consequently, our problem cannot be converted into a fully discrete-time problem.

Secondly, the distribution of a PDMP combines both absolutely continuous and singular components, owing to the forced jumps that occur when the process hits the boundary of the state space. As a consequence, the derivation of the filter process is not straightforward; in particular, the absolute-continuity hypothesis (H) of [92] does not hold.

Thirdly, in our context the reformulated optimization problem is not standard, unlike in [92]. Indeed, although we obtain a reformulation similar to an optimal stopping problem for a fully observed PDMP, it involves the Markov chain $(\Pi_n, S_n)_n$, which is not the embedded Markov chain of any PDMP. A new derivation of the dynamic programming equations is therefore required, as we cannot use the results of [81]; in particular, one needs to derive fine properties of the structure of the $(\mathcal{F}^o_t)_{t \ge 0}$-stopping times. Moreover, we construct an $\epsilon$-optimal stopping time.
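For orientation, such dynamic programming equations have the usual backward-recursion shape; the following is only a schematic rendering, with $\bar g$ and $K$ as hypothetical placeholders (the operators actually derived in the paper are more involved):

```latex
v_N(\pi, s) \;=\; \bar g(\pi),
\qquad
v_n(\pi, s) \;=\; \max\bigl\{\, \bar g(\pi),\; K v_{n+1}(\pi, s) \,\bigr\},
\qquad
\bar g(\pi) \;=\; \sum_{z \in E} g(z)\, \pi(z),
```

where $K$ stands for a one-step operator integrating $v_{n+1}$ against the transition law of $(\Pi_n, S_n)_n$, and the recursion is solved backward from $n = N$ to $n = 0$.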

Finally, a natural way to proceed with the numerical approximation is to follow the ideas developed in [92] and [8], namely to replace the filter $\Pi_n$ and the inter-jump time $S_n$ by finite-state-space approximations in the dynamic programming equation. A noticeable difference from [8], however, is that the dynamic programming operators there were Lipschitz continuous, whereas our new operators are only Lipschitz continuous between certain points of discontinuity. We overcome this drawback by splitting the operators into their restrictions to their continuity sets. In this way, we obtain not only an approximation of the value function of the optimal stopping problem but also an $\epsilon$-optimal stopping time with respect to the filtration $(\mathcal{F}^o_t)_{t \ge 0}$ that can be computed in practice.
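The quantization idea behind these finite-state approximations is to replace a random variable by its nearest-neighbor projection onto a well-chosen finite grid. The following minimal Lloyd-type iteration illustrates the standard technique in one dimension; it is a generic sketch, not the authors' implementation or their choice of grid-optimization algorithm.

```python
import random

def lloyd_quantize(samples, n_points, n_iters=20, seed=0):
    """Quantize a 1-D sample cloud onto n_points grid points.

    Classical Lloyd iteration: assign each sample to its nearest
    grid point, then move each grid point to the mean of its cell.
    Purely illustrative of the quantization idea used for the
    finite approximations of the filter and inter-jump times.
    """
    rng = random.Random(seed)
    grid = sorted(rng.sample(samples, n_points))  # initialize from the samples
    for _ in range(n_iters):
        cells = [[] for _ in grid]
        for x in samples:
            # nearest-neighbor projection of x onto the current grid
            i = min(range(len(grid)), key=lambda j: abs(x - grid[j]))
            cells[i].append(x)
        # move each grid point to the centroid of its cell (keep it if empty)
        grid = [sum(c) / len(c) if c else g for c, g in zip(cells, grid)]
    return grid
```

Once such a grid is fixed, the dynamic programming recursion is evaluated only at the grid points, which is what makes the value function and the $\epsilon$-optimal stopping time computable in practice.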

This work has been submitted for publication [60] and presented at an invited international conference [26].